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Abstract 

Recent advances in light microscopy have spawned new research frontiers in microbiology 
by working around the diffraction barrier and allowing for the observation of nanometric bi¬ 
ological structures. Microrheology is the study of the properties of complex fluids, such as 
those found in biology, through the dynamics of small embedded particles, typically latex 
beads. Statistics based on the recorded sample paths are then used by biophysicists to infer 
rheological properties of the fluid. In the biophysical literature, the main statistic for charac¬ 
terizing diffusivity is the so-named mean squared displacement (MSD) of the tracer particles. 
Notwithstanding the central role played by the MSD, its asymptotic distribution in different 
cases has not yet been established. In this paper, we tackle this problem. We take a pathwise 
approach and assume that the particle movement undergoes a Gaussian, stationary-increment 
stochastic process. We show that as the sample and the increment lag sizes go to infinity, the 
MSD displays Gaussian or non-Gaussian limiting distributions, as well as distinct convergence 
rates, depending on the diffusion exponent parameter. We illustrate our results analytically 
and computationally based on fractional Brownian motion and the (integrated) fractional 
Ornstein-Uhlenbeck process. 


1 Introduction 

Abbe’s diffraction limit stood for more than a hundred years as a barrier for light microscopy. The 
resolution limit of roughly 250nm (Inm = 10“®m) is large compared to organelles in biological 
cells and most nanostructures. However, in the last twenty years advances in light microscopy 
technology have spawned new research frontiers by allowing for the observation of nanobiological 
phenomena in vitro and in vivo up to resolutions of 10-20nm (e.g., Hell (2003, 2008), Betzing et al 
(2006), Rust et al (2006), Hess et al (2006), Westphal et al (2008), Berning et al (2012), Jones et 
al (2011), Huang et al (2013)). Microrheology is a rapidly expanding subfield of nanobiophysics. 
It consists of the study of the properties of complex fluids, such as those found in biology, through 
the dynamics of small embedded particles, typically latex beads, tracked and recorded by means 
of new light microscopy technology. Microrheology is currently the dominant technique in the 
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study of the physical properties of complex biofluids, of the rheological properties of membranes 
or the cytoplasm of cells, or of the entire cell (Mason and Weitz (1995), Wirtz (2009); see Didier 
et al. (2012) for a broad description of the statistical challenges involved). 

The characterization of the diffusive behavior of nanometric particles embedded in viscous, 
Newtonian fluids is now well-understood both physically and probabilistically. However, in com¬ 
plex fluids particles are expected to display non-classical, or anomalous, behavior due to the 
viscoelasticity of the fluid. As in the early analysis of diffusion, biophysicists dedicate a great deal 
of attention to the average distance traveled by a particle, namely, the mean squared displacement 
IJ, 2 {t) ;= EX‘^{t) (MSB), where X{t) denotes the position of the particle at instant t. For a given 
time window I, we can express the “local” MSB in the form 

EX^{t) ^ ee, a,9>0, tel, (1.1) 

where the parameters 6 and a are called the diffusivity coefficient and diffusion exponent, respec¬ 
tively. The microparticle is said to be sub-, super- or simply diffusive if the a is less than, greater 
than, or equal to 1, respectively. When a ^ 1, the diffusion is commonly named anomalous (see 
O’Malley and Cushman (2012) for a different perspective). The interval I in (1.1) can be of finite 
length or open-ended, according to the demands of physical analysis. In the former case, (1.1) 
expresses transient MSB behavior, as observed in polymer physics (Rubinstein and Colby (2003), 
Kremer and Crest (1990)). Alternatively, (1.1) describes the asymptotic MSB behavior (see the 
relation (2.13) for the accurate mathematical depiction of (1.1) in the context of this paper). 

Statistical evidence of anomalous diffusion has turned up in several contexts, including biod¬ 
iffusion (Valentine et al. (2001)), blinking quantum dots (Brokmann et al. (2003), Margolin and 
Barkai (2005)) and fluorescence studies in single-protein molecules (Kou and Xie (2004), Kou 
(2008)). The dominant statistical technique in the biophysical literature for estimating the dif¬ 
fusion exponent a is what we will call the sample mean squared displacement (MSB). Suppose 
that a microrheological experiment generates a tracer bead sample path with observations X(Aj), 
j = 0,1,..., n, where A G N stands for the sampling rate. The pathwise statistic 

^ n—h 

M^h) :=-- ^ (V(A(j + h)) - X{Aj)f (1.2) 

i=i 

is the MSB at h, i.e., the statistical counterpart of /U 2 (-) (for notational simplicity, we do not 
display the dependency of 7 Z 2 on n). Under (1.1), and assuming stationary increments, for m lag 
values hi < ... < hm and m n one hopes for ergodicity, namely, 

Jl 2 {^hk) ^ EX‘^{Ahk), k = l,...,m. (1.3) 

One then generates (log0, a) by means of the linear regression 

log 7 Z 2 (A/ifc) = log^-f alog(A/ifc)-f efc, A: = l,...,m, (1.4) 

possibly over several independent particle paths, where {Ek}k=i,...,m is a random vector with an 
unspecified distribution. Plots of MSB curves as a function of the lag h, sometimes on a log-log 
scale (see Figure 1), are widely reported as part of diffusion analysis (e.g., Valentine et al. (2001), 
Suh et al. (2005), Matsui et al (2006), Lai et al (2009), Lieleg et al. (2010)). The choice of lags 
hi,..., hm reflects the analyst’s visual perception of the range where the slope of the MSB curves 
stabilize and thus indicate the true diffusive regime and power law (1.1). 
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Figure 1: MSD plots, each with 50 paths of size 2^^ from an ifOU process with parameters A = 1 
and H (see Definition 2.1). Left: H = 1/4 (subdiffusive: a = 1/2). Right: H = 1/2 (diffusive: 
a = 1 ). 


The stochastic properties of the MSD depend on the underlying class of stochastic processes. 
In the review paper Meroz and Sokolov (2015), the authors classify physical models for subdif¬ 
fusive behavior according to whether one assumes the presence of binding-unbinding events, of 
geometrical constraints on the particle’s movement, or the medium is viscoelastic. This leads 
to three popular families of stochastic processes, respectively, those of continuous time random 
walks (Metzler and Klafter (2000), Meerschaert and Scheffler (2004)), of random walks on fractals 
(Havlin and Ben-Avraham (1987)), and of the celebrated fractional Brownian motion (fBm; see 
Example 2.1). In this paper, we focus on the latter family, more precisely, that of fractional, 
stationary increment processes (Barkai et al. (2012), Lysy et al. (2014)). 

The ergodicity of the MSD moments was established in Deng and Barkai (2009) for various 
families of fractional processes (see also Sokolov (2008), Metzler et al. (2009), Jeon and Metzler 
(2010), Burov et al. (2011), Jeon et al. (2013), Sandev et al. (2012)). Finite sample approximations 
to the distribution of the MSD under Gaussianity are provided in Grebenkov (2011a) (see also 
Qian et al. (1991), Grebenkov (20116), Boyer et al. (2012), Andreanov and Grebenkov (2012), 
Nandi et al. (2012), Boyer et al. (2013)). Nevertheless, so far MSD-based analysis of tracking 
data has missed one essential feature of statistical methods, namely, the limiting distribution of 
the random vector 

1 ■ ■ ■ 1 ■ ( 1 - 5 ) 

The purpose of this paper is to fill this gap. We work under the assumption that the particle un¬ 
dergoes a Gaussian process whose stationary increments display a covariance function 7 satisfying 
a decay condition of the type 


lh{z)l ,2 ~ Cz" 00 , ( 1 . 6 ) 

for some real constant C, where hi, /12 represent lag sizes (see the expressions (2.10) and (2.14) 
for precise definitions, notation and statements). As in the particular case of fBm, such a particle 
is not constrained by boundaries (such as those found in a cell) or a potential. 
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We assume the availability of just one sample path. This models the situation in which the 
biophysical samples are physically heterogeneous (Valentine et ah (2001), Dawson et ah (2003), 
Monnier et al. (2012)). Complex biomaterials such as mucus, or simulants such as agarose and 
hyaluronic acid, are expected to_be heterogeneous due to the unequal distribution of chains of 
polymers. Since the multiple MSD averages are formed from the same particle path, then even 
when 1^-2 — /ill is large the associated coefficients (1.2) still display strong correlation (Monnier 
et al. (2012)). Our main result (Theorem 3.1) shows that this yields limiting distributions and 
convergence rates that depend on the diffusion exponent range according to a familiar trichotomy 
in the literature on fractional processes. When 0 < a < 3/2, the asymptotic distribution is 
Gaussian, though the case a = 3/2 demands a non-standard convergence rate. When 3/2 < a < 2 
the convergence rate depends on the diffusion exponent and the asymptotic distribution is non- 
Gaussian; this reffects the classical results by M. Rosenblatt (1961) and M. Taqqu (1975). This 
type of result is well-known for hxed sequences of Gaussian, stationary random variables, or 
for p-variations of shrinking interval size of Gaussian processes (Guyon and Leon (1989), Peltier 
and Vehel (1994), Hosking (1996), Bardet (2000), Buchmann and Ghan (2009)). By contrast, 
we consider the MSD statistics in the same format found in the biophysical literature, namely, 
we take the lag limit /i —)■ oo. Moreover, whereas the related literature on Hermitian processes 
and random fields often makes use of Wiener-Ito chaos expansions and Malliavin calculus (e.g., 
Nourdin et al. (2010), Reveillac et al. (2012)), in this work we develop our results in the style of 
Rosenblatt’s classical arguments as to make the statements and techniques more readily available 
to the interested reader with a biophysical background. The asymptotic distributions provided 
allow for a new statistical perspective on the many numerical-experimental results reported by 
the biophysical community, and make it possible to mathematically compare MSD-based analysis 
with that based on other candidate statistical techniques, e.g., in the Fourier and wavelet domains. 

It should be stressed that we do not assume exact self-similarity (see relation (2.3)). Dispensing 
with the latter property is important because it is often of interest to start from a Newtonian 
instance, such as the generalized Langevin equation (GLE; e.g., Lysy et al. (2014), p.6), to arrive 
at an anomalous diffusion model that displays non-fractional short range behavior. In particular, 
we show that our results encompass the stationary-increment process induced by a fractional 
Ornstein-Uhlenbeck (fOU) velocity process (see Definition 2.1). The latter can be regarded as a 
spectrally simplified model for fractional instances of the GLE (see (2.21)). 

The paper is divided as follows. Section 2 contains most definitions and the assumptions used 
throughout the paper. We also shed light on the proposed assumptions by showing that^they 
imply the properties (1.1) and (1.6). In Section 3, we state and discuss weak limits for the MSD. 
Furthermore, Monte Carlo experiments are used to illustrate the Gaussian or non-Gaussian nature 
of the MSD distribution, and to study the quality of the asymptotic approximation. All proofs 
can be found in the Appendix. 

2 Preliminaries and assumptions 

All through the paper, C is used in bounds to denote a constant that does not depend on the 
sample size n, and which may change from one line to another. For two sequences of real numbers 
{a^jneN) {/^njneN, the expression On ~ means that ^ 1 as n —)■ oo. 

Recall that a stochastic process X is said to have stationary increments when 
{X{t + h) — A(/i)}tgR has the same finite-dimensional distributions for any time shift 
/i G M. The stochastic process in (1.1) is assumed to satisfy the following condition. 
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Assumption (A1): X = {A(t)}tg]R is a Gaussian, stationary-increment process with harmoniz- 
able representation 

where a G (0,2), Ca / 0, B{dx) is a C-valued Brownian measure such that B{—dx) = B{dx), 

EB{dx)B{dx') = 0 (x ^ x'), and E\B{dx)\‘^ = dx. The function s(x) is a bounded and complex- 
valned function with |s(0)|^ = 1, and 

I |s(x)|^ - 1| < C'ox'^°, xG(-£o,£o), (2.2) 


for constants > 0. 

In particnlar, the representation (2.1) implies that EX{t) = 0, t G M. However, this is inconse¬ 
quential for modeling, since one can always assume that a single diffusing particle starts at zero. 
In turn, the condition (2.2) is mild (c.f. Moulines et al. (2007), p.302, relation (4)) and plays a 
technical role in the proof of Proposition 2.1 below. 

Example 2.1 FBm is the only Gaussian, self-similar, stationary-increment process (Taqqu 
(2003), Proposition 2.3). The self-similarity of the fBm Bh = {BHit)}t£R means that, for a 
Hurst parameter 0 < H < 1, the scaling relation 

{BH{ct)}teR = {c^BH{t)}teR, O 0, (2.3) 

is satished. FBm has mean zero, and by Ganssianity, it is characterized by its closed-form covari¬ 
ance function 

EBH{s)BH{t) = y- It - s, t G M. (2.4) 

In particnlar, when = 1, we call Bh a standard fBm. Moreover, by taking s = t in (2.4), the 
expression (1.1) holds at all t as an equality with 

a = 2H. (2.5) 

When H is less than, greater than or equal to 1 /2, fBm is sub-, super- or simply diffusive (Brownian 
motion), respectively. The harmonizable representation of a standard fBm is given by 

X{t) = Ch / I ,H-i/2 ^(dx), (2.6) 

Jr 'f'X |x| ! 

where 

Ch = v^7FiHT(2H>h^(H^ (2.7) 

(see Taqqu (2003), p.31, expression (9.8)). Thus, fBm satisfies (Al) with Ca = Ch and s{x) = 1. 

Let A = 1 in (1.2). We assume that an experiment prodnced one seqnential sample 
Al,..., Xn, n G N, of observations from the stochastic process A. For h G N, let 

\hk •— U}kh}k=l,...,m (2.8) 

be distinct integer-valned increment sizes, where hm < n — 1, wi < W 2 < ■ ■ ■ < Wm and m < n — 1. 
We can define an associated vector of increments 

Y = (Yi(hi),..., Yn-/^i(/ii);^i(/i2),..., Y^_h,{h2 )]...; Yi(h™),..., Y^_h^{hm)V, (2.9) 
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where 

Yi{hk) := X{i + hk) - X{i), /ifcGNU{0}, k = l,...,m. 
Since X is a stationary-increment process, then the cross product 


EYj+,{hk,)Yj{hk,) = E{X{j + z + hk,)- X{j + z)){X{j + hk,) - X{j)) 

= E{X{z + hk,) - X{z)){X{hk,) - X(0)) = EY,(hk,)Yo(hk,) 

is not a function of j. Denote the covariance matrix of the increments by Sy(/ii, • • •, hm), where 
an entry has the form 

lh{z)k,,k2 = EYj+z{hk,)Yj{hk,), (2.10) 

for ki,k 2 = 1,... ,m. Note that (2.10) satisfies the symmetry relation 


'Jhi Z -|- hk, hk,)ki,k2 'yh(.z)ki,k2^ Z G Z. 

The self-similarity of fBm (see (2.3)) makes the asymptotic distribution of the MSD much simpler 
to establish (see Peltier and Vehel (1994), Proposition 4.2). Since we do not assume self-similarity, 
as in the biophysical literature we need to make the size of the lags themselves go to infinity, though 
slower than the sample size n. This mathematically expresses what biophysicists do in practice: h 
has to be large enough for the MSD regime to become linear, but at the same time cannot be too 
large because of the increased variance of the MSD. This is illustrated in Figure 1 and accurately 
described in assumption (A2), stated next. 

Assumption (A2): for h = h{n) G N U {0}, n G N, 


h{n) log^(n) 


+ 


n 


n 


/j(j^)l-|-(5/2 


0 , 


n 


oo, 


( 2 . 11 ) 


where 

(5 = min(a/2, (5o/2) (2-12) 

(see (2.1) and (2.2) for the definitions of a and 5o). 

As anticipated in the Introduction, the expressions (2.13) and (2.14) in the following proposi¬ 
tion give exact mathematical meaning to the heuristic properties (1.1) and (1.6). 


Proposition 2.1 Suppose the assumptions (Al) and (A2) hold. 


(i) Then, there is a constant 9 > 0 such that 


EX^{h) 


< Ch-\ 


n oo, 


where S > 0 is given by (2.12); 


(2.13) 


(a) moreover, for any ki,k 2 = 1, ...,m and ^h{z)ki,k2 (2-10), 



lh{z 

)fciT2 = kl" ^h‘^Wk,Wk2{T + 9{z,h)k,,k2}, n^oo, 

(2.14) 



hm + 'i- < \z\ < n. 

(2.15) 

where 


/ Ca o:(a - 1) 

" = = 2 • 

(2.16) 

Ch is 

given by (2.7), 

and the residual function g{-,-)k,,k 2 satisfies 




/ h \ ^ 

(2.17) 



\g{z,h)k,,k 2 \ < ^{ty) ’ 0 <h< | 2 ;|. 
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Besides fBm, another model for anomalous diffusion used in this work is what we call the 
integrated fractional Ornstein-Uhlenbeck (ifOU) process. One major difference between fBm and 
the ifOU process is that the latter is not exactly self-similar. The ifOU is an example of a fractional 
process whose MSD asymptotics are naturally studied under the assumption (A2). 

To define the ifOU process, recall that the fractional Ornstein-Uhlenbeck process (fOU) is the 
a.s. continuous solution to the fBm-driven Langevin equation 

dV{t) =—XV{t)dt + adEnit), t>0, A>0, 0<H<1 (2-18) 

(Rao (2010), p.78). The a.s. continuous process 

V{t) = a j t > 0, 

solves (2.18) with initial condition U(0) defined by the same integral. When H = 1/2, the 
solution is the classical Ornstein-Uhlenbeck process. We are interested in the stationary-increment 
counterpart of the fOU process, as put forward in the next definition. 


Definition 2.1 Given a fOU process {U(t)}t>o, the associated ifOU process is given by 


X{t) = [ V{s)ds, t > 0. 

Jo 


(2.19) 


The integrand in (2.19) is a version of V with continuous paths (see Didier and Fricks (2014), 
p.719. Lemma A.4). 


The ifOU is a simple parametric model for anomalous diffusion. This can be seen in the 
Fourier domain, based on the harmonizable representation 

p xtx 11 1 

X{t) = aVr(2F + l)sin( 7 rF) / " , B{dx). (2.20) 

The spectral density 

/x(x) = a^V(2B -F 1) sin( 7 riL) ] ^ , x G M\{0}, (2.21) 

A + X \X\ 

exhibits the short range dependence term (A^ + x‘^)~^ besides the fractional term with a 

tuning parameter A. 

Let H G (0,l)\{l/2}. By (2.21) and dominated convergence, the covariance function 7 ( 5 ) = 
Cov(U(t), V{t + s)) of V is continuous. Furthermore, in Cheridito et al. (2003), p. 8 , it is shown 
that 

2 N 2n—l 

7(s) = yY1 ( n (2.22) 

n=l k=0 

for an arbitrary G N, where the remainder is taken with respect to s —)■ 00 . Therefore, 

|£'X(ti)X(t2)| < f f l^isi - S2)\dsids2 < OO, (2.23) 

Jo Jo 

whence the integral (2.19) is also well-defined in the mean squared sense (see Cramer and Lead- 
better (1967), p. 86 ). By (2.20), like fBm the ifOU also satisfies (Al) and the conclusions of 
Proposition 2.1 apply, where the relation between a and H is again given by (2.5). Note that 
when the ifOU is simply diffusive {H = 1/2, or a = 1), the expression (2.14) holds with r = 0. 
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3 The asymptotic distribution of MSD—based anomalous diffu¬ 
sion parameter estimators 

The following theorem is the main result of this paper. It gives the asymptotic distribution of 
a random vector of MSD entries at different lag values, according to subranges of the diffusion 
exponent a. For the theorem, recall that in the notation (2.9), the MSD at a given lag value is 
given by 

Nk 

Mhk) = (3.1) 

i=l 

where 

Ni^:=n — hk, k = l,...,m. (3.2) 


Theorem 3.1 Suppose the assumptions (Al) and (A2) hold. Let m G N 6e the chosen number 
o/MSD lag values, and let Nk be as in (3.2), k = 1,... ,m. Then, 

. 

where 

( 0 < a < 3/2 : r/(n) = y/n, ({h) = 

< a = 3/2 : r]{n) = ^Jn log(n), C,{h) = (3-4) 

[ 3/2 < a < 2 : r/(n) = C,{h) = h?. 

In (3.3), 

(i) i/0 < a < 3/2, then Z ~ N{0,'E), where the entry ki,k 2 of the matrix S = S(a) is given 
by 


Yki ,k2 


2w 


-a-l/2 

fcl 


W 


- 0 - 1/2 

k2 



/Ci,/C2 = 1, . . . ,m. 


G{y;wki,Wk2) 

and Ch is given in (2.7); 


^2 (e™'=i2' - l)(e-™"22/ - 1) 


(3.5) 

(3.6) 


(ii) if a = 3/2, then Z ~ N{0, S), where the entry ki, k 2 of the matrix S = S(a) is given by 


5^fci,fc2=4r^> ki,k 2 = l,...,m. 


(3.7) 


(Hi) if 3(2 < a < 2, Z follows a multivariate Rosenblatt-type distribution whose characteristic 
function is 

, ,, r 1 ^ [2iT yr. tkY 

</z(t) = exp - ^, 


s=2 


(3.8) 


where, for s>2. 


1 D 


= Cs(a) = 


|xi —X 2 |“ ^1^2 —X 3 I" —xi|" ‘^dxidx2-■ ■ dxs- (3.9) 


0 JO JO 


c. 



Remark 3.1 It is worthwhile recalling the fact that the constant c* in (3.9) is, indeed, finite. 
Indeed, by an application of the Cauchy-Schwarz inequality, 


Cs < 


/ 7 ‘ 


Xl — X2\‘^°‘ dxidx2 


s/2 


(2a-3)(a-2) 


s/2 


(3.10) 


Theorem 3.1 allows us to develop the asymptotic distribution of the MSD-based least square 
estimator of the diffusivity coefficient and diffusion exponent. Recast the (pathwise) system (1.4) 
as the regression model 


where 


/ log/i2(^i) 


Qn — 


, 13 = 


log0 

a 


\ log//2(/im’ 

and e has a distribution to be determined. We will denote by 

X ■= {MlMnXM^Qn = ((l5r0)n, S 


n’t 

(1 

log(/ri) \ 

(3.11) 

, Mn = 

(, 

^ ■ 

log{hm) j 

(3.12) 


(3.13) 


the estimator generated by the ordinary least squares solution to the system (3.11). The next 
corollary describes the asymptotic distribution of the least squares estimator (3.13). 


Corollary 3.1 Suppose the assumptions (Al) and (A2) hold. Then, as n =)■ oo, 




((log6 ')n-log0) ^ dX U'^ 


r]{n)C{h) 


On — a 


-U^ 


AZ. 


In (3.14), 

A = A{0,a) = diag{C{wi)/{ew^),...X{'Wm)/{Ow^)), 
r]{ ), C(') o-nd Z are as in Theorem 3.1, and 


U^ = 


'^log{Wk/wi), ...,'^log{Wk/Wr, 


k=l 


k=l 


with eonstant 


c,i, = m 


'^log^iwk) - 


(3.14) 

(3.15) 

(3.16) 

(3.17) 


k=l 


k=l 


Corollary 3.1 states that the limiting distributions for /3„ are qualitatively distinct as a func¬ 
tion of the underlying diffusion exponent a. In particular, a non-Gaussian limit appears in the 
super diffusive range 3/2 < a < 2. Though probably of little interest in the modeling of viscoelas¬ 
tic diffusion, superdiffusion appears in many other applications (e.g., Brokmann et al. (2003), 
Margolin and Barkai (2005); note that in these papers the processes are viewed as following Levy 
walk-type dynamics). 

Remark 3.2 Corollary 3.1 can be directly used in the construction of confidence intervals, at 
least starting from knowledge that a lies in one of the subregions (0,3/2) or (3/2,2) of the 
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parameter space. To fix ideas, consider the parameter a; the ensuing argument can be easily 
adapted for 9. By Corollary 3.1, an — a \s asymptotically equivalent to 


v{n)C{h) 

nh^ 


U^AZ 


v{n)C,{h) 

nh°^ 


a{9, a)Z{a), 


where a{9,a) is a smooth function of {9, a) defined as to make Z{a) a standardized random 
variable. When 0 < a < 3/2, this is clearly possible, since the limiting distribution is Gaussian 
and —U'^A and S are smooth functions of 9 and a (see (3.5), (B.6) and (3.16)). When 3/2 < a <2, 
first note that Z in Theorem 3.1 is a rank 1 random vector. Indeed, recall that the characteristic 
function of a standardized (mean zero, variance one) Rosenblatt random variable is given by 


</>a(i) = exp|i^(2RV’(a))'' —}, 

^ s=2 ^ 

(see Veillette and Taqqu (2013), expression (4)). Let Zi be a Rosenblatt random variable with 
normalizing constant r (i.e., with the latter in place of '0(«) in (3.18)), and let Z 2 = ... = Zm '■= 
Z\. Then, = exp{^ So, denote by Z the 

limiting Rosenblatt random variable obtained in (2.16). Then, Z^ := {'4>{a)jT)Z is standardized 
and, by (2.16) and (3.18), the coefficient Tp{a)lT also depends smoothly on a. 

So, the consistency of 9n and ctn for 9 and a, respectively, implies that of a{9n, Sn) for a{9, a). 
When 0 < a < 3/2, Z{a) ~ fV(0,1), which is independent of a. Then, an approximate 100(1—^)% 
confidence interval for a is simply 

/ //l\l/2 ^ ^ \ 

\n) j • 


When 3/2 < a < 2, by (3.10) and the dominated convergence theorem, the characteristic function 
4>a{t) =■ </>(L a) of Za is continuous with respect to a for t around the origin. Now consider the 
function cj){z, a) with domain in 2 ; extended to a vicinity of the origin of C. By applying Theorem 
7.1.1 in Lukacs (1970) and the uniqueness of analytic continuation, we obtain that (f>{t,a) is 
continuous with respect to a for all t G M. Consequently, the cumulative distribution function 
and the quantile function z^(a) are also continuous with respect to a, whence z^{an) z^(a) for 
<2 G (0,1). So, an approximate 100(1 — .^)% confidence interval for a is 




2 <y.n ^ ^ ^ ^ 

a{9m an)z^l2{an),an + 



2 — CXn 


(j{9n, an)zi_^l 2 {ar 


(3.19) 


(see Veillette and Taqqu (2013) for numerical results on the quantiles of the Rosenblatt distri¬ 
bution). In (3.19), we are using the fact (^)^~“/(^)^~“ —^ 1, which can be verified by taking 
logs. 

To study the finite-sample properties of MSD-based estimation and the quality of the asymp¬ 
totic approximations described in Theorem 3.1 and Corollary 3.1, Monte Carlo experiments were 
conducted based on sub- and superdiffusiw instances of fBm and ifOU processes. Figure 2 displays 
the Monte Carlo distributions of the MSD, i.e., histograms and best Gaussian fit. The plots re¬ 
flect the results described in Theorem 3.1: for the subdiffusive Hurst parameter value of 77 = 0.25 
(a = 0.5), the distribution is distinctively Gaussian; by contrast, for the strongly super diffusive 
value H = 0.9 {a = 1.8), the Rosenblatt-like attractor skews the finite-sample distribution. More¬ 
over, Table 1 displays results for H = q;^^^/ 2 under a fBm. The simulations encompass two 
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Figure 2: Monte Carlo distribution of //2(1) 5,000 paths of size 2^*^. Top plots, fBm (left: 

a = 0.5; right: a = 1.8. Bottom plots, ifOU (left: a = 0.5; right: a = 1.8. In both cases, A = 1). 


distinct situations. In the first one, we follow the common practice in microrheology of taking a 
large number of consecutive lag values such as /imin, ^min + 1, • • •, ^max — 1, ^max) where /imin = 2 
and /imax = 128. In the second one, we pick on^ two lag values, namely hi = 2 and /i 2 = 128. 
The results show that dropping most of the MSDs has little impact on the performance of /3„. 
Moreover, simulation studies not shown provide evidence that in both subdiffusive and superdif- 
fusive ranges {H = 0.25,0.5,0.75,0.9 or a = 0.5,1,1.5,1.8, respectively), a pairwise combination 
of two low lag values (such as /ii = 1 and /i 2 = 2) leads to the best statistical performance, as 
measured by the Monte Carlo mean squared error. 

As expected, though, the results for the ifOU are quite distinct. Since the latter is not 
exactly self-similar, the MSD curves display the asymptotic flattening effect revealed in Figure 
1 for H = 0.25 {a = 0.50). This is what drives biophysicists to use larger lags when modeling 
anomalous diffusion data in the first place. Table 2 illustrates this effect in the estimation of 
/3„ based on triples of consecutive lag values: the estimation bias decreases as the chosen lags 
increase. However, since the variance also increases with the lag, due to the smaller number of 
terms in 7^2 (^)i the choice of regression lags with the lowest mean squared error turns out to be 
[2^, 2®, 2”^]. A different phenomenon emerges in the superdiffusive range. Table 2 shows that for 
very high values of H or a, it may be optimal to use lower regression lag values. This is so because 
the bias for large values of 77 or a is very large. Though not displayed, the same issue appears 
under different parameter values in the superdiffusive range, and its cause is a matter for future 
investigation. 
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4 Conclusion 


In this paper, we establish the asymptotic distribution of the MSD-based estimator widely used 
in the biophysical literature on anomalous diffusion. We assume that the particle undergoes a 
Gaussian, stationary-increment process, and take a pathwise approach, i.e., only one particle 
path is available. Depending on the diffusion exponent of the underlying process, the MSD-based 
estimator has Gaussian or non-Gaussian limiting distribution, as well as different convergence 
rates. The asymptotic distributions provided allow for a new statistical perspective on the many 
numerical-experimental results reported by the biophysical community. We illustrate our results 
analytically and computationally based on fractional Brownian motion and the integrated frac¬ 
tional Ornstein-Uhlenbeck process. 


A Proofs for Section 2 


We first show a lemma that will be used in the proof of Proposition 2.1. 

Lemma A.l Suppose assumptions (Al) and (A2) hold. Then, forki,k 2 = l,...,m 

Jr |xr+^ 

where h,z gZ, h> and 5 = min(Q;/2, (Jo/2) > 0 (see (2.1) and (2.2)/. 


and hk as in 


(A.l) 


Table 1: Mean and s.d. of H for fBm; consecutive lags (2,... ,2®) vs two lags {hi = 2, /12 = 2^) 


H = 0.25 H = 0.90 

(q; = 0.5) (q; = 1.8) 


n 

Econs{H) 

sdcons(77) 

E2{H) 

sd2(i7) 

^cons(77) 

sdcons(77) 

'E2{H) 

sd2(i7) 


0.2363 

0.0723 

0.2376 

0.0527 

0.8380 

0.1067 

0.8459 

0.0890 

210 

0.2437 

0.0508 

0.2446 

0.0359 

0.8583 

0.0772 

0.8644 

0.0648 

2 II 

0.2468 

0.0361 

0.2476 

0.0253 

0.8715 

0.0574 

0.8751 

0.0507 

212 

0.2482 

0.0252 

0.2486 

0.0177 

0.8792 

0.0462 

0.8814 

0.0415 


Table 2: Mean and 

s.d. of H for ifOU 

: small lags 

vs large lags, path length=: 

H 

= 0.25 (a = 0.5) 

H 

= 0.90 (a = 1.8) 


lags 

E(ff) 

sd{H) 

E{H) 

sd(i7) 


0.3218 

0.0272 

0.8817 

0.0266 

[2^,25,26] 

0.2964 

0.0354 

0.8441 

0.0392 

[2^2®,2^] 

0.2756 

0.0488 

0.8315 

0.0506 

[ 26 , 2 ^ 2 ®] 

0.2591 

0.0742 

0.8239 

0.0704 
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Proof: By (2.1), we obtain a harmonizable representation for the size h increment process 
y^(/i), namely, 


Y,{h) = X{z + h)- X{z) = Ca 


zGZ. 


IX \x 


a/2-1/2' 


(A.2) 


Fix ki and k 2 - For notational simplicity, we will use the indices ki = 1, /c 2 = 2. From (A.2), after 
the change of variables xh = y, we can write the covariance between the increments Yz{hk^) and 
Yz{hk 2 ) as 




2 f 


\y\ 


a-\-l 


s{y/h)\^ dy. 


Now break up the left-hand side of the expression (A.l) into the sum 




+ / 

\y\<Yh J\y\>Yh- 


-Jii_-'{|,{9//.)|2 - l)dy 


= \h+h\, 


where Ii and I 2 denote the integrals over the domains {—y/h, y/h) and M.\{—y/h, y/h), respectively. 


Then, for h> e. 


-2 

ZZ c-O > 


l^ll < C 


< C 


r g*toiJ/ _ g—*1022/ _ 1 

\y\<Vh ' 

\^iwiy _ \^—iw2y _ 


Moreover, since s{x) is bounded. 


\y\<Vh 

\^iwiy _ \^—iw2y _ ]^| 


|y| 


a+l 


\s{y/h)\‘^ - l\dy 
■ \y/hf° dy 

dy = 


\h\ < C 


r AM 

■l\y\>Yh 

The expressions (A.3) and (A.4) yield (A.l). □ 


dy < Ch-^/^. 


(A.3) 


(A.4) 


Proof of Proposition 2.1: Fix ki and k 2 - For notational simplicity, we will use the indices 
ki = 1 and /c 2 = 2. To show (i), set z = 0 and hk^ = hk^ = h in (A.l). Then, 


EX\h) 


7/* ( 0 ) 1,1 

0h^ 


eh^ 


< Ch-\ h 


00 , 


where d = |e*^ — dy. 

We now show {ii). The proof draws upon conveniently rewriting the integral term in (A.l) 
based on the closed form expression for the covariance of a (standard) fBm. In fact, recall that 


.Ch/ — 2 —’ by (2.16). Then, 


lh{z)i,2 


< 


Ihiz] 




Jm 


Z\a-2 

- WlW2T[ -j 

^^ihix _ -^'^^^-ih2X _ 


dx 
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Note that 



l)(g-i/l2X 


X 


Q + l 



— W 1 W 2 T 


Z\a-2 

h) 


fi2 

/l“ 




—ih2X 


|Q;+1 


dx 


(A.5) 


(A. 6 ) 


is the expression for the covariance 'yh{z)i ^2 of a size h increment process Yz{h) (see (A. 2 )) formed 
from a standard fBm Bh- Pick \z\ > hm + 1- If a = 1, the integral (A. 6 ) is identically zero, 
by the independence of non-overlapping increments. Alternatively, when a 7 ^ 1 , the closed form 
(2.4) with cj^ = 1 allows us to rewrite (A. 6 ) as 


P|EBh(2 + hi)B„(h2) - EB„ii)B„{h2)\ = + All" - k + fci - /I2r - kl” + k - '>21”) 


— h 

2/i“ IV 


1 + tci- 


h 


-1 - 


1 -I- {wi - W 2 ) 


h 


-1 + 


I - W2- 


h 


-or 


(A.7) 


Let f{x) = x“. Based on second order Taylor expansions of / around 1, we can recast the 
expression (A.7) as 


\z\°‘ f a{a — 1) fh 

2 Vl 



{wi - W 2 )‘^ +w\]+0 



,A.s) 

Therefore, based on (A. 8 ) (which also encompasses the case a = 1) and (A.l), the expression 
(A.5) can be further bounded by C'[(^)“~^ -|- h~^]. As a consequence, we arrive at 


lh{z)l,2 


< C 


h 


+ 




min(l,Q:) 



(5 


where the last two inequalities result from (2.11) and (2.15). Setting g{z,h) = t 

yields (2.14). □ 


B Proofs for Section 3 

Proof of Theorem 3.1: In view of (2.11), the claim is equivalent to 

n X 

V-\n)C\hk) Y,{y",ihk)-EX^{hk)}) 4 Z. 

4=1 7 k=l,...,m 

Consider the vector of increments 

Y = (Yi(/ii),..., Yn{hi); Yi{h2), ..., YM; ...; Yi(/i™),..., Yn{hm)f, 
The covariance matrix of Y can be written as Rm{n) = (i?A:i,fc 2 (^))A:i,A: 2 =i,...,m) where 

( 1hiS'^)ki,k2 ■ ■ ■ T/i(I R)fci,A:2 


Rki,k2^y) • 


\ y)ki,k2 


lh{^)ki,k2 
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Let 

Zn = {Zn{hl),...,Zn{hm))'^ (B.l) 

be the centered statistic defined by 

n 

Zn{hk) = v~\n)C\hk) - EX\hk)], k = l,...,m. 


Also, let 




,{n) := disig(^Di^i{n), D 2 , 2 in),Dm,min)^ , Di^i{n) := 


ti 


Cih 


■ In ) i — 1, ..., 771, 


where In denotes an n x n identity matrix. The weak limit (3.3) can be established via charac¬ 
teristic functions. The initial manipulation of the characteristic function is very similar to that 
in Rosenblatt (1961). First note that 


exp 


I - ^y^(-Rm^(n) - icZlm(n))y|(iy = (27r)”/^ det(R^^(n) - 7cT)m(n)) c 


By a similar computation to that in Taqqu (2011), pp.42-43. 


exp 


rri 

^i'^tkZr, 


k=l 


= exp 


^ mn 

|- —2iri~^{n) E ntkC {hk)^h{d)k,k ^ ^ (f I'i'n (^)^(,mn)j I"• 


(B.2) 

j j 

k=l 1=1 

The scalars Xi^mn, ^ = 1, • • • ,mn, denote the eigenvalues (characteristic roots) of Rm{n)Dm{n) = 
PJP~^, where P G GL{mn,C), and J is in Jordan form. By the analytic expansion of log(l — 
2iTj {jI)Xinin)i 


^log(l-27?7 ^(n)Ai,mn) = 2z77 ^(n) 


{‘liY 


{n) Y. \l 


(B.3) 


1=1 


1=1 


s=2 


1=1 


However, YlT=i^i,mn = ti:{Rm{n)Dm{n)) = YYk=i'^^kC, ^{hk)lh{d)k,k- Thus, by (B.3) we can 
rewrite (B.2) as 




s=2 1=1 


(B.4) 


Moreover, 


V Yn)YKmn = V %n)ti:[{Rmin)Dmin)y] 

1 = 1 
m 

= r]~yn) Y {tr RkiM{'^)Ek2Mi'^)Ek2M{'^)Ek^Mi'^) ■ ■ ■ } 


/Ci ,...,fcs = l 
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rri 

= [tkitk2---tksC^{hki)C^{hk2)---C^{hkJ 

ki,...,ks=l 

n 

X r/“^(n) Y lh{h-i2)kiMlh{i2-h)k2M'"'yh{is-ii)ksM']- 

The weak limits (3.5), (3.7) and (3.8) are a consequence of Propositions C.l and C.2. □ 


(B.5) 


Proof of Corollary 3.1 We first show that 

nh" f jl2{hk) 

r/(n)C(h) V Ohl 


- 1 


AZ, 


(B.6) 




where A = A{9,a) and Z are as in (3.15) and Theorem 3.1, respectively. Based on (3.1), rewrite 
the left-hand side of the expression (3.3) as 


( NkOhl 

(h2ihk) 

EX\hk)\ 

\v{K)C{hk) 

1 Ohl 

Ohl ) 


k=l,...,m 


This random vector has the same asymptotic distribution as 


nh°‘ 1 

^ Owk fMhk) 


rj(n)((h) ' 

^C(wk)\ Ohl 

However, the bound (2.13) yields 


nh^ Owl 

EX\hk) 


r/(n)C(h) C,{wk) 

Ohl 


nh" / ( EX\hk) 

ri{n)C{h)\Ciu’k)^ OK 


a 

^k 




- 1 


< 


Tih^ 

-h~^ —^0, k = 1,... ,m, 


ri{n)C{h) 


(B.7) 


(B.8) 


where the zero limit is a consequence of (2.11) and (3.4). The expression (B.6) is now a 
consequence of (B.7), (B.8) and (3.3). 


To show (3.14), rewrite 

X-P = (mJmO-1mJ(Q„ - M„/3). 
By entrywise first order Taylor expansions, 


Qn - MnP = f log 


' fj‘2ihk) \ 
Ohl ) 




/ M2(^fc) 

V Ohl 


-1 


+ 0 




/^2 (^fc) 

~ihY 


-1 


(B.9) 


(B.IO) 




On the other hand, note that det(M)fM„) = (see (3.17)) is a constant with respect to n. Thus, 


(M^ = - 


ELiW/ifc -ELilogh, 

- ELl log ^k 


m 


1 

log hi 


1 

log hm 


_ J_f log^ K - log hi YJk^i log hk 

m log hi - Er=ilog/ifc 


YJk=i log^ hk - log hm YJk=i log hk 
m log hm - Z]r=i log hk 


Moreover, for j = 1, ...,m, 

m m m m 

^log^ hk - loghj ^loghfc = logh^log(r(;fc/'U'j) + log Wk log{wk/wj) 


k=l 


k=l 


k=l 


k=l 
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and mloghj — Therefore, by (2.11), we obtain the entrywise 

asymptotic equivalence 




logh 0 
0 1 


Er=iiog(^fcM) 

Er=iiog(^iM) 


Er=iiog(^fc/^m) ^ 

Er=l log(«^rnM) j 


By (B.9), (B.IO), (B.ll), (B.6), and (3.16), we arrive at (3.14). □ 


(B.ll) 


C Auxiliary results 

Lemmas C.1-C.4, stated below, are used in the proofs in Propositions C.l and C.2. The proofs 
of the lemmas can be found in Section D. 

Lemma C.l Consider 3/2 < a < 2 and s > 2, and suppose the assumptions (Al) and (A2) 
hold. Then, as n —)• oo, 


n 

C (^fci) ■ ■ ■ C (n) 'y ^ ^h{ii i2')ki,k2 ■ ■ ■ 'yh{is ii^ka,ki ^ 0. (c.l) 

|il — ^2 |<^U...U|is — ix \ <h 

Lemma C.2 Consider 3/2 < a < 2 and s > 2, and suppose the assumptions (Al) and (A2) 
hold. Then, as n ^ oo. 


n 

C^{hki)---C^{hka)v~''{n) X] lh{k-i2)ki,k2---lh{is-k)ksM 


f ... [ |xi— X2|“ . .\Xs — Xl\^ “^dxi . . .dXg. 

Jo Jo 


(C.2) 


Lemma C.3 Consider 0 < a < 3/2 and s > 3, and suppose the assumptions (Al) and (A2) 
hold. Then, as n —)■ oo, 


C H^fci)---C ^{hka)r] %n) ^ 7h{k-i2)ki,k2'"'yh{is-k)ks,ki^^- 

Lemma C.4 Suppose the assumptions (Al) and (A2) hold. Then, as n ^ oo, 

(i) in the parameter range 0 < a < 3/2, 


(C.3) 


r] 2(n)C ^(/ifcjC ^ihk 2 ) 7lik-i2)kuk2 


h,*2=l 


^^{a+l/2)^^{a+l/2) ^ ) 11 i2 (k) , 


(C.4) 


where G{y;wki,Wk 2 ), Ca and Ch are defined by (3.6), (2.1) and (2.7), respectively; 
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(n) when a = 3/2 


n 

r/“2(n)C"^(/ifci)C"H^fc2) “ ^2)fci,fc2 ^ 2r^, 

n,* 2 =i 


where r is ^it'en by (2.16). 


(C.5) 


Proposition C.l Consider the parameter range 3/2 < a < 2 and suppose the assumptions (Al)- 
(A2) hold. Then, as n ^ oo, the vector Z„ = {Zn{hi), Zn{h 2 ), ■■■Zn{hm))'^ in (B.l) converges in 
law to a Rosenblatt-like distribution whose charaeteristic function is given by 

[\.j\x,-a:,rr..lx,-x,r^dx,...d:,.}. (C.6) 

„_9 ^ do do 


Proof: Let s > 2 and consider the expression (B.5). By Lemmas C.l and C.2, as n —)■ oo the 
right-hand side of the latter converges to 


m 

fcl,... = l 




ixi-x2r-2 


Xs — xi\°‘ ‘^dxi... dxg 


... |a:i —X2|“ . .\xs — xi\°‘ “^dxi.. .dxg. 

do do 

Therefore, the characteristic function (B.4) converges to (C.6), as claimed. □ 


= (E‘^-) 

fc=i 


Proposition C.2 For 0 < a < 3/2, suppose the assumptions (Al)-(A2) hold. Let Z„ = 
{Zn{hi), Zn{h 2 ), .■.Zn{hm))'^ be the random vector in (B.l). Then, as n ^ oo, Z„ -A N{0,Ti), 
where T is a m x m matrix with eomponents 


^ki,k2 


2u;^“ ^^‘^{^)^G{y;wk^,Wk2)\\l2^^y 0 < a < 3/2; 

4r^, a = 3/2, 


(C.7) 


and G{y;wk^,Wk 2 ) is defined by (3.6). 


Proof: When 0 < a < 3/2, by Lemma C.3 it suffices to consider the term (B.5) of order s = 2. 
Therefore, by Lemma C.4, as n —)■ oo the characteristic function (B.4) converges to that of a 
multivariate normal distribution with covariance matrix S = {'Zki^k 2 )ki,k 2 =i,...,m as in (C.7). □ 


D Additional proofs 

This section contains the proofs of Lemmas C.1-C.4. 

For hm + 1 < | 2 :| < n, recall that conditions (2.14) and (2.17) can be jointly expressed as 

lh{z)kuk2 = u>kiUJk2\z\'"~^ h^{T + g{z,h)k^^k2}, \9{z,h)ki,k2\ < g(J^^ , (D.l) 
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for a general pair of indices fci, /c 2 = 1,..., m representing shifting lag values. Moreover, by the 
Cauchy-Schwarz inequality and (2.13), 

\lh{z)k^^k 2 \ < , h,zeZ, (D.2) 

where C > 0 does not depend on ki, k 2 - In particular, for a single shifting lag value 

hki = hk 2 = h{n) =: h, (D.3) 

the expressions (D.l) and (2.13) imply that 

Ihiz) :='Jhiz)!,! = h^{T + g{z,h)}, \g{z,h)\ < . (D.4) 

Thus, in the proofs of Lemmas C.1-C.4 below, we will first establish the statements for a single 
index (shifting lag value) m = I and (D.3), and then adjust the constants to obtain the general 
statements for m > 1. In the generalization it will always be implicit that where a multiple 
summation is taken over index ranges of the form |ii — ^ 2 ] > h + 1 or |ii — i 2 | < h under m = 1, 
one should substitute hm for h under m > 1. 


Proof of Lemma C.l First assume m = 1. We only look at the subcase where the summation 
is taken over the index set 


{|*i - ^ 2 ! < ^} n {\i 2 - isl > + 1} n ... n {|is -ii\ > h + l}, (D.5) 


since the remaining 2® — 2 subcases can be tackled in a similar fashion. By (D.4), we can rewrite 
the expression of interest as 


r7®(n)C®(/i) 


n 

^ lh{h - h) 


\h—h\<h, K2—••• Ms—h\>h+l 


1*2 - ^ - hr ^ {'T + 9{h - h, h)} ...{t + g{is - ii,h)}, (D.6) 

where, under the summation sign, the terms of the form r + g{-, ■) can be uniformly bounded by 
a constant, and 7 /i(ii — ^ 2 ) is bounded by C/i" (see (D.2)). Thus, the absolute value of (D.6) is 
bounded by 


C^s{l-a}h-2 


= C 


^0-1 


E 


E 


\il—i2\<h,\i2—i3\>h+l,...,\ia 



h - h 

a-2 

1 

* 

a—2 

n 


n 



—il\>h+l 


to 

1 

CO 

a-2 

is -i 2 + z 

a—2 2 

n 


n 

ns-i 


1*2—i2+2|>/l+l 


1 1 ^ 

(-Y V 

h - h 

a-2 

is-h- sign(is - i2)h 

a—2 

\n/ 

n 


n 



1*2—*3|>^+l,...,|*s—*2+2|>/l+l 

~c(^) j ...j \X2 - X3r~‘^ . . .\Xs - X2r~‘^dX2 ■ ■ - dXs, 
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which goes to zero as n —)■ oo, since a > 3/2 and by (2.11). This shows (C.l) for m = 1. In 
addition, adjusting for the constants Wk-^, 10^2 from (D.l) does not alter the zero limit. Hence, 
(C.l) also holds for m > 1. □ 


Proof of Lemma C.2 First assume m = 1. We start out by establishing that 


E 


\i]_—i2\>h+l,...,\is—ii\>h+l 


1 

to 

0-2 

k - h 

a-2 

ig ii 

1 n 


n 


n 




1 rl fl 



3^1—2:21“ ^13:2 — 3:31“ ■■■\xs — xi\^ "‘dxidx2 ■ ■ ■ dxs, n^oo. ( 0 . 7 ) 


/O 30 30 

Indeed, since 


E 


k - k 

a-2 

k - k 

a-2 

1 

* 

n 


n 


n 1 






1 rl fl 



0 30 


... \xi — X2\'^ ‘^\x2 — X2,\'^ . .\xs — xi\^ ‘^dxidx2 ■ ■ ■ dxs, n —)• 00, (D. 8 ) 

Jo 


and the sum on the left-hand side of (D.8) can be broken up into 


n 

{ E 


+ 


E 


V 

1 

to 

a-2 

k - k 

ji 

n 


n 


a-2 


\il—i2\>h+l,...,\is—ii\>h+l {\ii—i2\>h+l,...,\is—ii\>h+lY 


is - h 


n 


(D.9) 


then it suffices to show that the second summation term in (0.9) goes to zero. However, the latter 
can be established by a similar argument to that in the proof of Lemma C.l. Thus, (0.7) holds. 
Based on (D.4), recast the left-hand side of (C.2) as 




E 


H-i2|“ {t + g{ii - i2,h)} .. .\is - + g{is - ii,h)} 


ri^(n)(^(h) 

\il—i2\>h+l,...,\is—ii\>h+l 

( 0 . 10 ) 

In view of (D.7), we only need to show that the remaining terms involving at least one residual 
function g in (0.10) go to zero. Pick a number p in the interval (0,min((I, a — 3/2)). By (0.4), 
\g{h/z)\ < C^hjzf < C{h/zY, z>h + l. Therefore, 


1 




E 


k-hY .. .\is - hY g{is - h,h) 


n32’'"3s=i 

\il—i2\>h+l,...,\is—ii\>h+l 


< 


c 


n-sia-l) 


E 


In — *21“ ^ • In — n|“ ^ 


\ii—i2\>h-\-l,...,\is—h\>h-\-l 




n/ 


E 


\ii—i2\>h+l,...,\is—h\>h+l 


k - k 

a-2 

is—\ is 

a-2 

is - k 

n 


n 


n 


a—2—p 
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~C'^—^ J •••J \^i ~ ■ ■\xs-i — Xs\°^ ^|xs —xi|“ ^ ^ dxi ... dxs ^ 0, (D.ll) 

as n —7- oo. The limit in (D.ll) is a consequence of (2.11) and of the fact that the multiple integral 
is finite by the same argument as in Remark 3.1. This establishes (C.2) under (D.4). 

For m > 1, by (2.14) and (3.4) the constants Wk, k = l,...,m, in (D.IO) cancel out. 
Moreover, by (D.l) and (D.2), the zero limit in (D.ll) still holds; consequently, so does the limit 
(C.2). □ 


Proof of Lemma C.3 For m = 1, rewrite the sum in (C.3) as 

n n 

r/“^(n)C"^(/i)| X] + X] (D.12) 

- 1 - 1 


We will show that both multiple summation terms go to zero. We first show this over the index 
range \ii — ^ 2 ] < U ... U |zs — ii\ < h] moreover, as in the proof of Lemma C.l, we will only 
consider the index set (D.5). 

Fix the parameter range 0 < a < 3/2. By (D.4), (D.2) and the Cauchy-Schwarz inequality, 
the expression (D.6) is bounded in absolute value by 
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(D.13) 


z=h 


In the subranges 0<a<l, a = l, !<«< 3/2, (D.13) is bounded, respectively, by the 
expressions C(^)^/^“^, 


//i\ s/2-1 3 //ilog^(n)\ s/2-1 1 

C - log"3(n)= - 

Vn/ V n / log(n) 


and ^ all of which converge to zero as n —)■ 00 under (2.11) for s > 3. 

Next consider the case a = 3/2. By a simple adaptation of the procedure leading to (D.13), 
we arrive at the bound 
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Therefore, in the parameter range 0 < a < 3/2, by extending the conclusion to the whole 
summation range of interest, 

n 

r]~^{n)C%h) ^ Jh{ii - i2) ■ ■ ■ lh{is - h) ^ 0, n^oo. (D.14) 


We now show that the multiple summation over the index range \ii — i 2 \ > h, — ii\ > h 
in (D.12) also goes to zero. Starting from the expression (D.IO), by the same argument with the 
residual function g in the proof of Lemma C.2, it suffices to consider 




'n{n)C{h) 


E 


Ki-f2r ^1^2-ial" '^...\is-kr 


1*1—*2|>^+l,|*2—*l|>^+l 


By Cauchy-Schwarz, this expression is bounded from above by 


ri^{n)(^{h) 


E 


In -* 2 !" K2-f3r ...\is-2-is-l\ 


n.*2’'-'’»s-i=i 


|il—i2|>/i+l,...,|L-2—L-l|>/i+l 


E 


1^5-1 - 


1 2a—4 


1/2 


E 


\is - 


2a-4 


1/2 


is = l 


*S = 1 

\is—i\\>h+l 


< 


Ch 


2s 




E ^ 

2 = ^+1 


2a-4 


E 


\il-i 2 r ^1*2-fsl" ^ . |is-2 


il,i2,...,-is_l=l 

\il—i2\>h+l,..;\is-2—is-l\>h+l 


However, the multiple summation term in (D.15) is bounded by 
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Therefore, when a = 3/2, by (D.16) the expression (D.15) can be bounded by 
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as n —)■ 00 , since s > 3 and by (2.11). 

On the other hand, when 0<a<l, a = l and 1 < a < 3/2, the bound for (D.15) becomes, 
respectively. 
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and 
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(D.19) 


J^s{a+l/2) ^s/2 

as n —)• oo. These three limits hold because s > 3 and by (2.11). Thus, the expressions (D.14), 
(D.17), (D.18), and (D.19) yield (C.3) for m = 1. 

For m > 1, by (D.l) and (D.2) the zero limits in (D.14), (D.17), (D.18), and (D.19) still hold; 
consequently, so does (C.3). □ 


Proof of Lemma C.4 We begin by showing (i) for m = 1. Rewrite 
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*142 = 1 z=—n+l 




(D.20) 


As n —)• oo, the summand in (D.20) goes to, and is also bounded by, {h " 7 / 1 ( 2 ;))^^. Therefore, if 
we can show that 


Y (^"“7/*(^))^;^ ^ I|G(x)||^2(r), n^oo, 


(D.21) 


2 =—n+1 


then (C.4) is obtained as a consequence of the dominated convergence theorem. Indeed, by setting 
Wk = wi = 1 and making the change of variables hx = y in the relation (A.l), 
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Therefore, 
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where Gh denotes the covariance function of a standard fractional Gaussian noise (fCn) Y(t) = 
Bnit) — Bnit — 1 ), t G M, i.e.. 




Gh{z) := EY{t)Y{t + z) = 
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So, recast the expression on the left-hand side of (D.21) as 
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where the vanishing term o(l) is a consequence of (2.11). Since Gh{z) G L^(M) for 0 < H < 3/4 
(0 < a < 3/2; see (2.5)), the first summation on the right-hand side of (D.24) converges to 


Gh' 


G]j{z)dz = 




Gh 


\G{x)\^dx. 


(D.25) 


The equality in (D.25) is a consequence of Parseval’s theorem based on the inverse Fourier trans¬ 
form f{z) = (2Tr)~^^^ fj^e^^^f(x)dx, / G L^(M). Moreover, 
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since the function Gh{-) is bounded and by (2.11). So, by the expressions (D.24), (D.25) and 
(D.26), we obtain (D.21), and hence (C.4), for m = 1. 

For m > 1, essentially the same argument can be used, and we simply indicate the minor 
changes. The expression (D.20) must be replaced by 

r] (n)C (hfcJC X/ ^ \a+i/2 X/ '^lh{z)kT_,k2f 

In addition, in expression (D.22) one should substitute 6*1^ — l)(e*"''= 2 ?^ — 

dy for the integral Jjg |— l\‘^\y\~^°^~^^'^dy, where the former can be reinter¬ 
preted as the covariance between the increments Bnit) — Suit — Wk^) and Bnit') — Bnit' — 
t — t' = j^, of a standard fBm Bh- The rest of the argument can be applied in the same way to 
eventually arrive at the limit (D.21) with \\G{y;wk^,Wk 2 )\\‘^ 2 ^-^'^ in place of ||G(a;)||^ 2 (R)- Thus, 
(C.4) holds also for m > 1. 


To show (a) for m = 1, note that we can apply (D.2) with a = 3/2 in the summation range 
1^1 — ^ 2 ! < ^ to obtain 

n ^ 

n-^log-\n)h-* ^hin - ^ 2 ) < ^ 0, (D.27) 

\il-i2\<h 

by (2.11). Alternatively, in the summation range \ii — i 2 \ > h -|- 1, by (2.14) we have 
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Note that for /3 > 0 and large enough n, z~^dz < — l)~^dz. Conse¬ 

quently, if /3 = 1, 

log(n) - log(h -H 1) ^ E^=fe-n ^ log(ra - 1) - log(h) 

log(n) “ log(n) “ log(n) 

Thus, the left summation term in (D.28) goes to 2r^ as n ^ 00 . We now show that the remaining 
two terms in (D.28) go to zero with n. It also suffices to look at the third term in (D.28), because 
a similar approach can be used with the second term. Indeed, the former can be bounded by 
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log(n) 
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h log(n^ 


/ CXD 
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as n —)• 00 . Together with (D.27), this establishes (C.5) for m = I. 

For m > 1, by (2.14) and (3.4) the constants Wk, k = l,...,m, in (D.28) cancel out. 
Moreover, by (D.l) and (D.2), the zero limits in (D.27) and in (D.28) (for the second and third 
terms) still hold; consequently, so does the limit (C.5). □ 
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